###############################################################################
This file provides a guide to the code and data objects associated with 

Moore, Ryan T. and Sally A. Moore. "Blocking for Sequential Political 
Experiments."  Political Analysis, 2013.
###############################################################################

###############################################################################
Notes

1. The user should create a replication directory that includes three components:

  A. this file,
  B. a subdirectory /code/, which includes all .R files to create data and conduct analyses, and
  C. a subdirectory /data/, which includes all .RData files, CombinedData.csv, and horimatan07.txt.
      These files include
    i. intermediate .RData objects, so that many large, slow objects need not be recreated,
    ii. replication data for our PTSD experiment (pending Veterans' Administration approval), and
    iii. replication data for two other applications: the HIT paper (Horiuchi, Imai, Taniguchi 2007) and the CGQ paper (Cobb, Greiner, Quinn 2011).
  
2. Optionally, the user may create additional directories in order to recreate the objects for which we provide intermediate summary objects.  This step is not required: paper figures and quantities can be replicated directly from the intermediate objects we provide.  The optional directory structure should have two components:
  
  A. /data/simsMVN/, with empty subdirectories /r0/, /r6outlier2/, /r6outlier20/, /r6outlier35/, /r8/, and /r8bimodal/.  
  B. /data/simsMVNoutlier/, with empty subdirectories /outlier2/, /outlier20/, and /outlier35/.
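
The optional directories in Note 2 could be created from within R along the following lines.  This is only a sketch: make_sim_dirs() is a hypothetical helper that is not part of the replication files, and it assumes that `root` is the replication directory described in Note 1.

```r
# Sketch: create the optional, empty simulation directories listed in Note 2.
# make_sim_dirs() is a hypothetical helper, not one of the replication files.
make_sim_dirs <- function(root = ".") {
  sim_dirs <- c(
    file.path(root, "data", "simsMVN",
              c("r0", "r6outlier2", "r6outlier20", "r6outlier35", "r8", "r8bimodal")),
    file.path(root, "data", "simsMVNoutlier",
              c("outlier2", "outlier20", "outlier35"))
  )
  for (d in sim_dirs) {
    # recursive = TRUE also creates /data/ and the parent directories if absent
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(sim_dirs)
}
```

Calling make_sim_dirs() with the R working directory set to the replication directory would create all nine subdirectories; directories that already exist are left untouched.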

3. Running the files roughly in the order listed below replicates the paper.

4. Below, "SB" means "sequential blocking"; "CR" means "complete randomization".

5. The code assumes that the R working directory is the replication directory, i.e., the directory one level above /code/ and /data/.
###############################################################################
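
Before sourcing any file, it may help to confirm that the working directory matches the layout assumed in the notes above.  The sketch below uses a hypothetical helper, check_replication_dir(), which is not part of the replication files; it only tests for the /code/ and /data/ subdirectories.

```r
# Sketch: check that a directory contains the /code/ and /data/ subdirectories.
# check_replication_dir() is a hypothetical helper, not one of the replication files.
check_replication_dir <- function(root = getwd()) {
  needed <- file.path(root, c("code", "data"))
  present <- dir.exists(needed)
  if (!all(present)) {
    warning("Missing subdirectories: ", paste(needed[!present], collapse = ", "))
  }
  invisible(all(present))
}
```

If the check returns FALSE, setwd() can be used to move to the replication directory before running the files listed below.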


### Files that define functions and are sourced when needed: ###
1. mahal.R calculates the matrix of Mahalanobis distances between all possible pairs
2. seqblock1.R collects metadata and assigns first experimental unit
3. seqblock2k.R collects data and assigns 2nd and subsequent experimental units
4. plotSBminusCR.R takes balance/precision summary object, plots comparisons
5. balanceCheck.R conducts and plots several balance tests
6. calcAIPWest.R calculates AIPW treatment effect estimates
7. calcBalancePrecision.R calculates balance and precision statistics
8. calcBPdirectory.R wraps calcBalancePrecision.R and applies it to dataframes in a directory
9. invertRIconfInt.R inverts the randomization inference test to produce non-parametric confidence intervals
10. ksTCrealLoop.R calculates the KS test p-value
11. simmer.R creates many simulated data sets
12. simulateOutlier.R creates data with outliers at positions 2, 20, 35
13. teEstimate.R estimates difference-in-means and AIPW causal effects (on adjusted data)
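
For readers unfamiliar with the distance computed in file 1, the sketch below shows one way to obtain pairwise Mahalanobis distances in base R.  It is an illustration only: pairwise_mahal() is a hypothetical name, the use of the sample covariance and of the square-root (rather than squared) distance are assumptions, and mahal.R may differ in interface and details.

```r
# Sketch: pairwise Mahalanobis distances between the rows of a numeric matrix.
# pairwise_mahal() is a hypothetical name; it is not the function defined in mahal.R.
pairwise_mahal <- function(x) {
  x <- as.matrix(x)
  s_inv <- solve(stats::cov(x))  # inverse of the sample covariance (an assumption)
  n <- nrow(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # stats::mahalanobis() returns squared distances from every row to `center`
    d2 <- stats::mahalanobis(x, center = x[i, ], cov = s_inv, inverted = TRUE)
    d[i, ] <- sqrt(pmax(d2, 0))  # guard against tiny negative rounding error
  }
  d
}
```

The result is a symmetric n-by-n matrix with zeros on the diagonal, where n is the number of rows (units) in x.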


### Files that prepare data, perform analyses, and create plots: ###

14. simulateBalancePrecisionMVN.R
- (need not run; the next file sources an intermediate object.)
- creates MVN uncorr, MVN corr, MVN outlier, and MVN bimodal data sets (including SB assignment).
- calculates balance and precision statistics for each data set (including SB and CR assignments).
- to replicate, need only load("data/balprecAllMVN.RData")
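
To see what the intermediate object contains without cluttering the workspace, one option is to load it into its own environment, as sketched here.  The names bp_file, bp_env, and loaded are illustrative, and the relative path assumes the working directory described in Note 5.

```r
# Sketch: load the intermediate object into a separate environment and list its contents.
# bp_file, bp_env, and loaded are illustrative names.
bp_file <- file.path("data", "balprecAllMVN.RData")
if (file.exists(bp_file)) {
  bp_env <- new.env()
  loaded <- load(bp_file, envir = bp_env)  # load() returns the names of the restored objects
  print(loaded)
} else {
  message("Not found: ", bp_file, " (is the working directory the replication directory?)")
}
```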

15. plotBalPrecMVN.R
- loads balance and precision statistics calculated by simulateBalancePrecisionMVN.R (6 objects of 40400 observations each)
- plots balance and precision of all simulated data, organized two different ways.
- implementation via library(ggplot2)
- conducts outlier simulations (sources simulateOutlier.R)

16. conductRealisticSims.R
- creates many datasets with covariates anticipated in PTSD trial
- assigns treatments to datasets using several SB techniques

17. evaluateDesignFullData.R 
- loads PTSD trial data (pending Veterans' Administration approval)
- creates balance plots (dynamic, var-by-var, var-by-var p-vals)

18. horimatan07rep.R
- calculations from Horiuchi, Imai, Taniguchi (2007) data

19. cobgrequi11prep.R
- (need not run; sourced by next file cobgrequi11rep.R)
- prepares Cobb, Greiner, Quinn (2011) data

20. cobgrequi11rep.R
- calculations from Cobb, Greiner, Quinn (2011) data

21. pscoreOutcomeCalc.R
- estimates the propensity score for each PTSD unit using 10,000 simulated SBs.
- creates pedagogical xtable()
- using teEstimate.R (which calls calcAIPWest.R),
   - calculates and plots AIPW and difference-in-means estimates for 5 outcomes
   - calculates and plots RMSEs

22. outcomeRIcalc.R
- calculates randomization inference confidence intervals
- uses invertRIconfInt.R


### Data files: ###
23. balprecAllMVN.RData (simulated): Balance and precision statistics for all SB and CR assignments
24. CombinedData.csv (application): Cobb, Greiner, Quinn (2011) data
25. d2pCGQ100.RData (simulated): Balance from SB rerandomizations of CGQ data
26. d2pHIT100.RData (simulated): Balance from SB rerandomizations of HIT data
27. estsCRfull.RData (application/simulated): Effect estimates from CR assignments of PTSD data
28. estsSBfull.RData (application/simulated): Effect estimates from SB assignments of PTSD data
29. horimatan07.txt (application): Horiuchi, Imai, Taniguchi (2007) data
30. outRIfakeFULLfine.RData (application/simulated): Values included in inverted RI confidence intervals
31. pscoresFull10k.RData (application/simulated): Assignment probabilities in 10,000 sequential blockings
32. SMRPreplication.RData (application): PTSD trial data, with adjusted outcomes (pending Veterans' Administration approval)
